On a Classical Method for Computing Eigenvectors
Abstract
One of the oldest methods for computing an eigenvector of a matrix $F$ is based on the solution of a set of homogeneous equations, an approach which can be traced back to the times of Cauchy. The principal difficulty of this approach was identified by Wilkinson. We remove this obstacle and analyse the viability of this classical method. The key to the analysis is provided by the reciprocals of the diagonal elements of the inverse of the matrix $F - \lambda I$, where $\lambda$ is a shift approximating an eigenvalue. The final missing link is a perturbation result due to Sherman and Morrison, who proved that $F - (\{F^{-1}\}_{j,i})^{-1} e_i e_j^*$ is singular. We extend this result to the block case. Finally, we give a new impetus to Rayleigh quotient and Laguerre iterations.

1 Introduction and Summary

An eigenvector $x$ corresponding to an eigenvalue $\lambda$ of a general dense matrix $F$ can be obtained by solving the homogeneous system of equations $(F - \lambda I)x = 0$. The conventional method of solving homogeneous equations is to obtain the solution in terms of one of the unknowns, say the $k$th element of $x$, $x_k$, which could be set to unity without loss of generality. This method was in fact used by Cauchy, in the same paper in which the famous interlace theorem was derived. Perhaps this was the method of choice for computing eigenvectors at that time; the paper preceded the method discovered by Jacobi for symmetric matrices by sixteen years. In vibration analysis, where the matrix or the matrix pencil is tridiagonal, the dropping of the $n$th or the first equation has been traditional since the beginning of this century; see Holzer.

In exact arithmetic, with perfect knowledge of the eigenvalue, ideal results can be obtained by choosing almost any $k$. The obvious and convenient choice for $k$ is either $1$ or $n$. Wilkinson showed that in finite precision computation the choice of $k$ is crucial: an arbitrary $k$ can give extremely bad results, and the best $k$ for real symmetric matrices is determined by the maximal $|x_k|$. Unfortunately, this observation cannot be used directly to find the eigenvector $x$, since the maximal element is not known in advance. See also the corresponding sections of Wilkinson's book. The same criterion has been known to practising mechanical engineers, who choose the best $k$ by physically observing the part of rotating machinery (e.g. flywheels, rotors, cylinders attached to crankshafts) which has the largest amplitude of vibration at a given frequency (mode); see Bishop et al.

One of the keys to our analysis is the observation by Sherman and Morrison that the matrix $F - \delta_{i,j}\, e_i e_j^*$ is singular if $\delta_{i,j} = 1/(e_j^* F^{-1} e_i)$. Here, and throughout this paper, $e_k$ is the unit vector with a one in the $k$th place and zeros elsewhere. This scalar quantity $\delta_{i,j}$ sometimes appears in papers concerned with condition number estimation, but its importance in eigenproblems has not been fully recognised. We generalise this result to the block case and apply it to the eigenvector problem.

In finite precision arithmetic the eigenvalue is in general not known exactly, and it is represented by an approximate value (shift) $\lambda$. We show that the mathematical problem of finding the best $k$ is identical to that of finding the smallest scalar $\gamma_{k,k}$ which makes the matrix $F - \lambda I - \gamma_{k,k}\, e_k e_k^*$ singular. Note that the perturbation matrix $\gamma_{k,k}\, e_k e_k^*$ has a solitary non-zero element, $\gamma_{k,k}$, at the $k$th diagonal position. Thus the modulus $|\gamma_{k,k}|$ may be considered as the distance to singularity at the $k$th diagonal element of $F - \lambda I$.

Rutishauser appears to be the first to use the $\gamma_{k,k}$ in eigenproblems. He introduced them in the context of differential qd algorithms for eigenvalues of tridiagonal matrices. Fernando and Parlett extended the results to differential qd algorithms for singular values of bidiagonal matrices and the SVD of triangular matrices via the implicit Cholesky LR algorithm, where the $\gamma_{k,k}$ appear as intermediary quantities denoted by $d_k$.
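Since $\gamma_{k,k}$ is the reciprocal of the $k$th diagonal element of $(F - \lambda I)^{-1}$, it can be computed directly once the shifted matrix has been factorised. The following sketch is my own illustration, not code from the report: it uses NumPy and an explicit inverse for clarity (at $O(n^3)$ cost), whereas structured matrices admit the cheaper schemes mentioned below.

```python
import numpy as np

def gammas(F, shift):
    """Reciprocals of the diagonal of inv(F - shift*I).

    gamma[k] is the single-element perturbation to the (k, k) entry
    of F - shift*I that makes the shifted matrix exactly singular.
    """
    n = F.shape[0]
    G = np.linalg.inv(F - shift * np.eye(n))  # dense O(n^3) route, for illustration only
    return 1.0 / np.diag(G)

# Example: a random symmetric matrix and a shift near one of its eigenvalues.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
F = (A + A.T) / 2
lam = np.linalg.eigvalsh(F)[2] + 1e-8   # slightly perturbed eigenvalue as the shift
g = gammas(F, lam)
k = np.argmin(np.abs(g))                # best index: smallest distance to singularity
print(k, g[k])
```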
We study the nearly homogeneous system of equations defined by

$$(F - \lambda I)\, z(\lambda, k) = \gamma_{k,k}(\lambda)\, e_k, \qquad z_k = 1,$$

where $F$ is any square (general) matrix, $\lambda$ is a shift equal to an eigenvalue of $F$ or a good estimate of one, and $\gamma_{k,k}$ is a scalar which is usually tiny. Then $z(\lambda, k)$ is the approximate eigenvector corresponding to the approximate eigenvalue $\lambda$. Of course, if $\gamma_{k,k}$ is zero then $z$ is an exact eigenvector; in practice, for best results, $|\gamma_{k,k}|$ should be as tiny as possible.

Suppose $F$ has linear elementary divisors, so that $F = X \Lambda Y$ with $\Lambda$ diagonal and $Y = X^{-1}$; we show that

$$\gamma_{k,k} \approx \frac{\lambda_j - \lambda}{y_{j,k}\, x_{k,j}} \quad \text{as } \lambda \to \lambda_j,$$

where the product $y_{j,k}\, x_{k,j}$ is invariant if $\lambda_j$ is a simple eigenvalue, although $X$ is not unique. If the matrix $F$ is normal and $X$ is taken to be unitary, then this product becomes $|x_{k,j}|^2$. This reaffirms and extends the criterion of Wilkinson that $k$ should be chosen such that $|x_k|$ is maximal (for real symmetric matrices). It is well known, in the context of inverse iteration, that the shift $\lambda$ should be as close as possible to the eigenvalue it approximates. We have exploited the extra variable, the index $k$, to maximise the component of the desired eigenvector in the computed solution $z$. Of course, Wilkinson located the first part of the jigsaw, but we have been able to find and assemble all the missing links.
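For a normal matrix the asymptotic formula is easy to check numerically, since $e_k^*(F - \lambda I)^{-1} e_k = \sum_j |x_{k,j}|^2/(\lambda_j - \lambda)$ and the term of the nearest eigenvalue dominates. The following is a small sanity check of my own, assuming a real symmetric $F$:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 8))
F = (A + A.T) / 2                      # real symmetric, hence normal
w, X = np.linalg.eigh(F)               # F = X diag(w) X^T, columns of X orthonormal

j, k = 3, 5
lam = w[j] + 1e-7                      # shift close to the simple eigenvalue w[j]
G = np.linalg.inv(F - lam * np.eye(8))
gamma_kk = 1.0 / G[k, k]
approx = (w[j] - lam) / X[k, j] ** 2   # gamma_kk ~ (lambda_j - lambda) / |x_{k,j}|^2
print(gamma_kk, approx)                # the two values agree to several digits
```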
There are other dividends from this discovery. It is possible to execute Rayleigh quotient iterations more effectively. A similar windfall gives Laguerre iterations a new impetus. For the symmetric tridiagonal eigenproblem and the bidiagonal SVD problem, the new Rayleigh quotient and Laguerre iterations give encouraging results, which will be disclosed elsewhere.

The cost of computing the $\gamma_{k,k}$ ($k = 1, \ldots, n$) is $O(n^3)$ for a general dense matrix. However, the complexity is less for structured matrices. As an example, if the matrix is tridiagonal, the cost becomes $O(n)$; see the ICIAM proceedings and the technical report. The same count is true for bidiagonal matrices as well as for diagonal matrices with rank-one updates. For Toeplitz matrices, using the Trench algorithm, the operation count becomes $O(n^2)$, which could be reduced further by using less numerically stable algorithms. Computational details are not discussed in this report; they will be studied for different classes of matrices elsewhere. In this paper we do not study the problems due to clustered eigenvalues and defective eigensystems; such details will be analysed elsewhere.

The structure of this report is organised in the following manner. Section 2 describes the notation used, while nearly homogeneous systems are analysed in Section 3. In Section 4 the perturbation results are developed. The basic eigenvector algorithm is presented in Section 5. In Section 6, historical aspects of finding the best $k$ are summarised. We extend the results of Wilkinson in Section 7 and the results of Rutishauser in Section 8. In Section 9 the quality of the computed eigenvalues and the corresponding eigenvectors is verified. The Rayleigh quotient is easy to compute in our framework, and the virtues of this quotient are expounded in Section 10. A new treatment of Laguerre iterations is given in Section 11. Most of the basic results developed in this report can be extended to the generalised eigenvalue problem $(F - \lambda G)x = 0$ by replacing the occurrences of $F - \lambda I$ by $F - \lambda G$.

2 Notation and Preliminaries

Scalars are denoted by lower-case Greek characters. Eigenvalues are shown as $\lambda_i$ and singular values as $\sigma_i$. Vectors are denoted by bold Roman characters such as $x$, $y$ and $z$. The unit vector $e_k$ has a one in the $k$th element and zeros elsewhere. The $k$th element of $x$ is $x_k$, the complex conjugate of $\alpha$ is $\bar{\alpha}$, and the complex conjugate transpose of $x$ is $x^*$. Matrices are shown as upper-case Roman characters. The submatrix of $F$ containing the rows from $i$ to $j$ and the columns from $k$ to $l$ is indicated by $F(i{:}j, k{:}l)$, which is consistent with Fortran and MATLAB notations; the abbreviation $F(i{:}j)$ is used for the submatrix $F(i{:}j, i{:}j)$. We use the shorthand notation $F(\setminus k, \setminus l)$ for the matrix obtained by removing the $k$th row and the $l$th column of the matrix $F$. Similarly, $y(\setminus k)$ is the vector we get if the $k$th element is omitted. If $F$ has linear elementary divisors, we use the factorisation $F = X \Lambda Y$ with $\Lambda$ diagonal and $Y = X^{-1}$. Note that $X$ and $Y$ are not unique. We use the notation $x_k$ for the $k$th column of $X$. If $F$ is normal, then we choose $Y$ such that $Y = X^*$.

The following lemma is easy to prove, but we could not trace it in the literature.

Lemma 1. Let $F = X \Lambda Y$, $Y = X^{-1}$, with $\Lambda$ diagonal. If the eigenvalue $\lambda_j$ is distinct, then the product $y_{j,i}\, x_{i,j}$, for any $i$ ($1 \le i \le n$), is invariant over all scalings of $X$.

3 Nearly Homogeneous Systems

Suppose that an eigenvalue $\lambda$ of the matrix $F$ is known exactly. It is well known that the eigenvector $x$ corresponding to $\lambda$ is given by the homogeneous system of equations

$$(F - \lambda I)\, x = 0, \qquad x \neq 0. \tag{1}$$

This system can be solved by assuming an arbitrary non-zero value for $x_k$, for a particular $k$ ($1 \le k \le n$). This is a reasonable assumption since not all components of $x$ can be zero. Thus the rest of the solution can be obtained by solving

$$\begin{pmatrix} F(1{:}k{-}1) - \lambda I & F(1{:}k{-}1,\, k{+}1{:}n) \\ F(k{+}1{:}n,\, 1{:}k{-}1) & F(k{+}1{:}n) - \lambda I \end{pmatrix} \begin{pmatrix} x(1{:}k{-}1) \\ x(k{+}1{:}n) \end{pmatrix} = -x_k \begin{pmatrix} F(1{:}k{-}1,\, k) \\ F(k{+}1{:}n,\, k) \end{pmatrix}. \tag{2}$$

The $k$th equation of (1) is of the form

$$(F - \lambda I)(k,\, 1{:}n)\, x = 0. \tag{3}$$

However, in practice, eigenvalues are not known exactly, and hence the shift $\lambda$ is not an exact eigenvalue of $F$. Thus equation (3) will not be satisfied exactly, even if the solution of (2) is known exactly. That is, instead of (3) we get $(F - \lambda I)(k,\, 1{:}n)\, x = \gamma_{k,k}$, where the scalar $\gamma_{k,k}$ denotes the residual. Thus, instead of (1), we are effectively solving the system of equations

$$(F - \lambda I)\, y(\lambda, k) = \gamma_{k,k}\, e_k, \tag{4}$$

where we have changed the notation from $x$ to $y(\lambda, k)$ to emphasise that we are now solving for an approximate eigenvector $y(\lambda, k)$, corresponding to the approximate eigenvalue $\lambda$, with the residual $\gamma_{k,k}$. To avoid cluttering, we often omit the arguments $\lambda$ and $k$ from $y(\lambda, k)$ and other similar vectors.

Lemma 2. If $F(\setminus k, \setminus k) - \lambda I$ is non-singular, then a non-zero solution $y$ of (4) exists if and only if $y_k$ is not zero.

Proof. If $y_k$ is zero then, removing the $k$th equation from the set of equations (4), we get

$$(F(\setminus k, \setminus k) - \lambda I)\, y(\setminus k) = 0. \tag{5}$$

Since $F(\setminus k, \setminus k) - \lambda I$ is non-singular, $y(\setminus k) = 0$ and hence $y = 0$, which shows the sufficiency. Suppose that $y$ is non-zero but $y_k$ is zero. Again we get equation (5), which gives $y(\setminus k) = 0$, contradicting our assumption that $y$ is non-zero. Thus a non-zero $y_k$ is also necessary for a non-trivial solution.

The above result indicates that $y_k$ can be assigned a non-zero value (say unity) in solving (4) whenever $F(\setminus k, \setminus k) - \lambda I$ is non-singular.

Theorem 1. Let $F(\setminus k, \setminus k) - \lambda I$ be non-singular and let

$$(F - \lambda I)\, z = \gamma_{k,k}\, e_k, \qquad z_k = 1,$$

where $z = z(\lambda, k)$ is a vector which depends on $k$ and $\lambda$. Then the perturbed matrix $F - \gamma_{k,k}\, e_k e_k^* - \lambda I$ is singular and, for a fixed $k$, $\gamma_{k,k}$ is unique. Moreover, $\lambda$ is an exact eigenvalue of the perturbed matrix $F - \gamma_{k,k}\, e_k e_k^*$, and $z$ is the corresponding eigenvector.

Proof. Since $z_k = 1$, the defining system can be written in the form $(F - \lambda I)\, z = \gamma_{k,k}\, e_k e_k^* z$, and hence $(F - \gamma_{k,k}\, e_k e_k^* - \lambda I)\, z = 0$. Since $z$ is non-zero (from Lemma 2), $F - \gamma_{k,k}\, e_k e_k^* - \lambda I$ is singular. The uniqueness of $\gamma_{k,k}$ is due to the fact that $\det(F - \gamma\, e_k e_k^* - \lambda I)$ is linear with respect to $\gamma$.

This theorem suggests that $\lambda$ and $k$ should be chosen such that $|\gamma_{k,k}|$ is as tiny as possible if accurate eigenvectors are required. The problem is how to find a $k$ such that $|\gamma_{k,k}|$ is minimal and tiny, which is the main objective of this report.
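In floating point, Theorem 1 translates into a simple recipe: solve $(F - \lambda I)\, w = e_k$, and then $\gamma_{k,k} = 1/w_k$ and $z = w/w_k$. A minimal sketch of my own, assuming a dense solve (the structured factorisations mentioned in the introduction would replace `np.linalg.solve`):

```python
import numpy as np

def eigvec_from_index(F, shift, k):
    """Solve (F - shift*I) z = gamma * e_k with z_k = 1.

    Returns the approximate eigenvector z and the residual gamma,
    so that shift is an exact eigenvalue of F - gamma * e_k e_k^T.
    """
    n = F.shape[0]
    w = np.linalg.solve(F - shift * np.eye(n), np.eye(n)[:, k])
    gamma = 1.0 / w[k]          # gamma_{k,k}: distance to singularity at (k, k)
    return w * gamma, gamma     # z = w / w_k

# Usage: pick the index k that minimises |gamma| over all indices.
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
F = (A + A.T) / 2
lam = np.linalg.eigvalsh(F)[0] + 1e-9
best = min(range(5), key=lambda k: abs(eigvec_from_index(F, lam, k)[1]))
z, gamma = eigvec_from_index(F, lam, best)
print(best, gamma, np.linalg.norm(F @ z - lam * z))   # residual norm equals |gamma|
```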
The next two results are presented without proof.

Corollary 1. If $\gamma_{k,k}$ is zero for a particular $k$, then $\gamma_{i,i}$ is zero for all $i$.

Corollary 2. If $\lambda$ is an exact eigenvalue of the matrix $F$ and $F(\setminus i, \setminus i) - \lambda I$ is non-singular, then $\gamma_{i,i}$ is zero.

4 A Single Element Perturbation to Singularity

If $F$ is a non-singular matrix and $g$ and $h$ are vectors, is it possible to find a scalar $\delta$ such that $F - \delta\, g h^*$ is singular? This was perhaps first answered by Sherman and Morrison (see, in particular, the corollary of the following theorem), but we present a more general result. It is well known that the Sherman-Morrison formula is a special instance of the Woodbury formula. However, we are studying a perturbation result, and not the formula itself for updating the inverse of a matrix when a row or column of that matrix is changed; see Woodbury and the abstract of Sherman and Morrison.

Theorem 2. Let $F$ be a non-singular matrix, let $G$ and $H$ be non-zero $n \times m$ and $n \times r$ matrices, respectively, and let $W$ be an $m \times r$ matrix. If there exists an $n \times r$ matrix $Z$ such that $FZ = GW$ and $H^* Z = I$, then the matrix $F - G W H^*$ is singular. Furthermore, $H^* F^{-1} G W = I$.

Proof. Since $H^* Z = I$, we get $FZ = G W H^* Z$ and hence $(F - G W H^*)\, Z = 0$. Because $Z$ is non-zero, $F - G W H^*$ is singular. To prove the second part, we note that $Z = F^{-1} G W$ and hence $H^* Z = I = H^* F^{-1} G W$.

An immediate corollary is the following.

Corollary 3. If $H^* F^{-1} G$ is non-singular, then $F - G\, (H^* F^{-1} G)^{-1} H^*$ is singular.

In control theory, and sometimes in the theory of continued fractions, the matrix $H^* F^{-1} G$ is known as the first block Markov parameter; see Kailath and Gantmacher. These block Markov parameters are the block elements of Hankel matrices in control theory. In scalar form they appear as elements in certain Hankel matrices which give a theoretical basis for qd algorithms; see Fernando and Parlett, and Henrici.

Before any further analysis, we remind the reader that there is an attainable lower limit to the norm of a perturbation which can make a matrix singular.

Theorem 3 (Gastinel). The minimum distance from $F$ to the set of singular matrices is given by $1/\|F^{-1}\|$, where $\|\cdot\|$ is any operator norm.

For the spectral norm, the above theorem is a direct consequence of the Schmidt-Mirsky theorem, which is also known as the Eckart-Young theorem; see Stewart and Sun. For an arbitrary operator norm, see Kahan, who attributed the theorem to Gastinel.

Is it possible to make a matrix singular by changing just one element, say the $(i,j)$th, by a scalar, say $\delta_{i,j}$? The next result is immediate. (There are many ways to generalise the Sherman-Morrison perturbation result to the block case; they will be reported elsewhere.)

Corollary 4. If $F - \delta_{i,j}\, e_i e_j^*$ is singular, where $\delta_{i,j}$ is a scalar, then $|\delta_{i,j}| \ge 1/\|F^{-1}\|$. Furthermore, $|\delta_{i,j}| \ge \sigma_{\min}(F)$.

The following theorem provides a method to compute $\delta_{i,j}$. The corollary of the next theorem is not new: Sherman and Morrison discovered it forty-five years ago; see also Householder. Since then it has been rediscovered many times; Demmel stated that the result is available in the literature.

Theorem 4. Let $F$ be a non-singular matrix. If a non-zero vector $z$ exists such that $Fz = \delta_{i,j}\, e_i$ with $e_j^* z = 1$, then $F - \delta_{i,j}\, e_i e_j^*$ is singular. Furthermore, $e_j^* F^{-1} e_i = 1/\delta_{i,j}$.

Proof. By setting $G = e_i$, $H = e_j$ and invoking Theorem 2, we arrive at the stated result.

Corollary 5 (Sherman and Morrison). If $e_j^* F^{-1} e_i$ is non-zero, then $F - (e_j^* F^{-1} e_i)^{-1} e_i e_j^*$ is singular.

If $F$ is a covariance matrix, then $1/\delta_{i,i}$ is sometimes known as the variance inflation factor; see Golub et al.

We present another useful result without proof.

Corollary 6. If $Fz = \delta_{i,j}\, e_i$ with $z_j = 1$, then $\delta_{i,p} = \delta_{i,j}/z_p$ if $z_p \neq 0$.

Remark 1. The above corollary indicates that, by computing $\delta_{i,i}$ and $z$, all the other $\delta_{i,p}$ (the reciprocals of the elements of the $i$th column of $F^{-1}$) can be computed at a cost of $n$ divisions.
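Corollary 6 says that one linear solve yields $\delta_{i,i}$ and, with $n$ further divisions, every $\delta_{i,p}$ for the fixed row index $i$. A small dense illustration, my own sketch rather than code from the report:

```python
import numpy as np

rng = np.random.default_rng(3)
F = rng.standard_normal((6, 6)) + 6 * np.eye(6)   # comfortably non-singular

i = 2
w = np.linalg.solve(F, np.eye(6)[:, i])   # w = F^{-1} e_i
delta_ii = 1.0 / w[i]                     # delta_{i,i} from Theorem 4
z = w * delta_ii                          # normalised so that z_i = 1
delta = delta_ii / z                      # n divisions give all delta_{i,p} (Corollary 6)

# Check: F - delta_{i,p} e_i e_p^* is singular for each p.
p = 4
E = np.zeros((6, 6))
E[i, p] = delta[p]
print(np.linalg.det(F - E))               # ~ 0 up to roundoff
```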
Remark 2. Note that the approximate eigenvector $z$ is invariant, up to normalisation, for different values of $\delta_{i,j}$ with a fixed $i$.

Theorem 5. Let $(F - \lambda I)\, z = \gamma_{k,k}\, e_k$ with $z_k = 1$. The determinant of the matrix $F - \lambda I$ is then given by

$$\det(F - \lambda I) = \gamma_{k,k}\, \det(F(\setminus k, \setminus k) - \lambda I).$$

Proof. Cramer's rule gives $z_k = \gamma_{k,k}\, \det(F(\setminus k, \setminus k) - \lambda I)/\det(F - \lambda I)$. Since $z_k = 1$, we get the proposed result.

The following result is immediate.

Corollary 7. The eigenvalues of $F$ are given by the zeros of $\gamma_{k,k}(\lambda)$:

$$\gamma_{k,k} = \gamma_{k,k}(\lambda) = \frac{\det(F - \lambda I)}{\det(F(\setminus k, \setminus k) - \lambda I)}.$$

We now present an example to illustrate our theory, using a matrix studied by Wilkinson.

Example 1. Let $F$ be an upper bidiagonal matrix of order $20$ with $f_{i,i} = i$, $f_{i,i+1} = 20$ and $f_{i,j} = 0$ otherwise, and let $\lambda = 0$. It can easily be shown that $\{F^{-1}\}_{1,20} = -20^{19}/20!$ and hence $\delta_{20,1} = -20!/20^{19} \approx -4.6 \times 10^{-7}$. Thus a slight perturbation to the element $f_{20,1}$ will make the perturbed matrix singular, in spite of the fact that the determinant of $F$ is $20! \approx 2.4 \times 10^{18}$. Note that $\delta_{i,i} = i$, which are the eigenvalues of the matrix, and $\delta_{i,j} = \infty$ for $i < j$.

The matrix in the following example was kindly provided by Cleve Moler.
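A quick numerical check of Example 1 (again my own sketch; the exact value $-20!/20^{19}$ is compared against the computed $\delta_{20,1}$):

```python
import numpy as np
from math import factorial

n = 20
F = np.diag(np.arange(1.0, n + 1)) + 20.0 * np.diag(np.ones(n - 1), 1)

delta = 1.0 / np.linalg.inv(F)[0, n - 1]   # delta_{20,1} = 1 / {F^{-1}}_{1,20}
print(delta, -factorial(20) / 20.0**19)    # both are about -4.64e-07

Fp = F.copy()
Fp[n - 1, 0] = -delta                      # F - delta_{20,1} e_20 e_1^T is singular
print(np.linalg.svd(Fp, compute_uv=False)[-1])   # smallest singular value ~ 0
```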